On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization
Conventional wisdom in deep learning states that increasing depth improves
expressiveness but complicates optimization. This paper suggests that,
sometimes, increasing depth can speed up optimization. The effect of depth on
optimization is decoupled from expressiveness by focusing on settings where
additional layers amount to overparameterization - linear neural networks, a
well-studied model. Theoretical analysis, as well as experiments, show that
here depth acts as a preconditioner which may accelerate convergence. Even on
simple convex problems such as linear regression with $\ell_p$ loss, $p > 2$,
gradient descent can benefit from transitioning to a non-convex
overparameterized objective, more than it would from some common acceleration
schemes. We also prove that it is mathematically impossible to obtain the
acceleration effect of overparameterization via gradients of any regularizer.
Published at the International Conference on Machine Learning (ICML) 2018.
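The effect is easy to probe numerically. Below is an illustrative NumPy sketch, not the paper's experiments: gradient descent on an $\ell_4$ regression loss under (a) the direct parameterization and (b) a depth-2 diagonal overparameterization $w = w_1 \odot w_2$. The data, initialization, and step size are arbitrary choices for demonstration.

```python
import numpy as np

# Illustrative sketch (not the paper's code): l4-loss regression,
# L(w) = mean((X @ w - y)**4), minimized by gradient descent with
# (a) the direct parameterization w, and (b) a depth-2 diagonal
# overparameterization w = w1 * w2 (elementwise).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5)

def grad(w):
    r = X @ w - y
    return 4 * X.T @ (r ** 3) / len(y)   # d/dw of mean((Xw - y)^4)

lr = 1e-3
w = np.zeros(5)                    # direct parameterization
w1 = np.full(5, 0.1)               # overparameterized:
w2 = np.full(5, 0.1)               #   end-to-end weights are w1 * w2
for step in range(20001):
    w = w - lr * grad(w)
    g = grad(w1 * w2)              # gradient w.r.t. end-to-end weights
    w1, w2 = w1 - lr * g * w2, w2 - lr * g * w1
    if step % 5000 == 0:
        print(step,
              np.mean((X @ w - y) ** 4),
              np.mean((X @ (w1 * w2) - y) ** 4))
```

Note that the overparameterized update applies the plain end-to-end gradient rescaled by the companion weights, which is the preconditioning effect the paper analyzes; whether it outruns plain gradient descent on a given run depends on the step size and initialization.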
Deep SimNets
We present a deep layered architecture that generalizes convolutional neural
networks (ConvNets). The architecture, called SimNets, is driven by two
operators: (i) a similarity function that generalizes inner-product, and (ii) a
log-mean-exp function called MEX that generalizes maximum and average. The two
operators applied in succession give rise to a standard neuron but in "feature
space". The feature spaces realized by SimNets depend on the choice of the
similarity operator. The simplest setting, which corresponds to a convolution,
realizes the feature space of the Exponential kernel, while other settings
realize feature spaces of more powerful kernels (Generalized Gaussian, which
includes as special cases RBF and Laplacian), or even dynamically learned
feature spaces (Generalized Multiple Kernel Learning). As a result, the SimNet
operates at a higher level of abstraction than a traditional ConvNet. We argue
that enhanced expressiveness is important when the networks are small due to
run-time constraints (such as those imposed by mobile applications). Empirical
evaluation validates the superior expressiveness of SimNets, showing a
significant gain in accuracy over ConvNets when computational resources at
run-time are limited. We also show that in large-scale settings, where
computational complexity is less of a concern, the additional capacity of
SimNets can be controlled with proper regularization, yielding accuracies
comparable to state-of-the-art ConvNets.
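For the MEX operator, a natural reading of "log-mean-exp" is $\mathrm{MEX}_\beta(x) = \frac{1}{\beta}\log\big(\frac{1}{n}\sum_i e^{\beta x_i}\big)$, which recovers the maximum as $\beta \to \infty$ and the average as $\beta \to 0$. A minimal NumPy sketch (the function name and numerical stabilization below are illustrative, not from the paper):

```python
import numpy as np

def mex(x, beta, axis=-1):
    """Log-mean-exp: (1/beta) * log(mean(exp(beta * x))).

    Computed via a stabilized log-sum-exp so that large beta * x
    does not overflow. beta must be nonzero; the beta -> 0 limit
    is the arithmetic mean.
    """
    z = beta * np.asarray(x, dtype=float)
    m = z.max(axis=axis, keepdims=True)
    lse = m + np.log(np.exp(z - m).mean(axis=axis, keepdims=True))
    return np.squeeze(lse, axis=axis) / beta

x = np.array([1.0, 2.0, 3.0, 4.0])
print(mex(x, beta=100.0))   # ~4.0: approaches max as beta -> +inf
print(mex(x, beta=1e-6))    # ~2.5: approaches the mean as beta -> 0
print(mex(x, beta=-100.0))  # ~1.0: approaches min as beta -> -inf
```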
"Zero-Shot" Super-Resolution using Deep Internal Learning
Deep Learning has led to a dramatic leap in Super-Resolution (SR) performance
in the past few years. However, being supervised, these SR methods are
restricted to specific training data, where the acquisition of the
low-resolution (LR) images from their high-resolution (HR) counterparts is
predetermined (e.g., bicubic downscaling), without any distracting artifacts
(e.g., sensor noise, image compression, non-ideal PSF, etc.). Real LR images,
however, rarely obey these restrictions, and state-of-the-art (SotA) methods
therefore perform poorly on them. In this paper we introduce "Zero-Shot" SR, which
exploits the power of Deep Learning, but does not rely on prior training. We
exploit the internal recurrence of information inside a single image, and train
a small image-specific CNN at test time, on examples extracted solely from the
input image itself. As such, it can adapt itself to different settings per
image. This makes it possible to perform SR on real old photos, noisy images, biological
data, and other images where the acquisition process is unknown or non-ideal.
On such images, our method outperforms SotA CNN-based SR methods, as well as
previous unsupervised SR methods. To the best of our knowledge, this is the
first unsupervised CNN-based SR method.
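The test-time training recipe can be sketched in a few lines. The following illustrative PyTorch snippet is not the authors' implementation: the tiny network, scale factor, L1 loss, and step count are assumptions, and the real method adds crops, augmentations, and a gradual scale schedule, among other refinements.

```python
import torch
import torch.nn.functional as F

# ZSSR-style sketch: train a small CNN at test time to map a
# downscaled copy of the input image back to the input, then apply
# it to the (upsampled) input itself to super-resolve it.
scale = 2

def make_net():
    return torch.nn.Sequential(
        torch.nn.Conv2d(3, 64, 3, padding=1), torch.nn.ReLU(),
        torch.nn.Conv2d(64, 64, 3, padding=1), torch.nn.ReLU(),
        torch.nn.Conv2d(64, 3, 3, padding=1),
    )

def zssr(img, steps=200):
    """img: (1, 3, H, W) float tensor -- the test image itself."""
    net = make_net()
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    # "LR son": the input downscaled, then upsampled back to the
    # input size, so the net learns to restore the lost detail.
    lr_son = F.interpolate(img, scale_factor=1 / scale, mode='bicubic')
    lr_son = F.interpolate(lr_son, size=img.shape[-2:], mode='bicubic')
    for _ in range(steps):
        opt.zero_grad()
        loss = F.l1_loss(net(lr_son), img)   # reconstruct the input
        loss.backward()
        opt.step()
    # Test time: feed the bicubically upsampled input itself.
    up = F.interpolate(img, scale_factor=scale, mode='bicubic')
    with torch.no_grad():
        return net(up)

# Usage on a random stand-in "image":
sr = zssr(torch.rand(1, 3, 64, 64))
print(sr.shape)  # torch.Size([1, 3, 128, 128])
```

Because the network is fully convolutional, it can be trained on the small image and then applied at the larger size, which is what makes the per-image, test-time adaptation practical.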